From decoding-driven to detection-based paradigms for automatic speech recognition

نویسنده

  • Chin-Hui Lee
چکیده

We present a detection-based automatic speech recognition (ASR) paradigm that is capable of integrating both the knowledge sources accumulated in the speech science community and the modeling techniques established in the speech processing community. By exploring this new framework, we expect that researchers in the Interspeech community can collaboratively contribute to developing next generation algorithms that have the potential to surpass current capabilities, and go beyond the limitations of the state-of-the-art ASR technologies.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Abstract   Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

متن کامل

Noise-Robust Speech Recognition Through Auditory Feature Detection and Spike Sequence Decoding

Speech recognition in noisy conditions is a major challenge for computer systems, but the human brain performs it routinely and accurately. Automatic speech recognition (ASR) systems that are inspired by neuroscience can potentially bridge the performance gap between humans and machines. We present a system for noise-robust isolated word recognition that works by decoding sequences of spikes fr...

متن کامل

Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods

For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, the design and production of speech recognition systems have been considered by researchers. Among this, lip-reading techniques encountered with many challenges for speech recognition, that one of the challenges b...

متن کامل

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

متن کامل

Audio-visual speech recognition system for a robot

Automatic Speech Recognition (ASR) for a robot should be robust for noises because a robot works in noisy environments. Audio-Visual (AV) integration is one of the key ideas to improve its robustness in such environments. This paper proposes AV integration for an ASR system for a robot which applies AV integration to Voice Activity Detection (VAD) and speech decoding. In VAD, we apply AV-integr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004